A corpus-based connectionist architecture for large-scale natural language parsing
Abstract
We describe a deterministic shift-reduce parsing model that combines the advantages of connectionism with those of traditional symbolic models for parsing realistic sub-domains of natural language. It is a modular system that learns to annotate natural language texts with syntactic structure. The parser acquires its linguistic knowledge directly from pre-parsed sentence examples extracted from an annotated corpus. The connectionist modules enable the automatic learning of linguistic constraints and provide a distributed representation of linguistic information that exhibits tolerance to grammatical variation. The inputs and outputs of the connectionist modules represent symbolic information, which can be easily manipulated and interpreted, and provide the basis for organising the parse. Performance is evaluated using labelled precision and recall: on a test set of 4,128 words, the parser achieved 75% precision and 69% recall. The work presented represents a significant step towards demonstrating that broad-coverage parsing of natural language can be achieved with simple hybrid connectionist architectures which approximate shift-reduce parsing behaviours. Crucially, the model is adaptable to the grammatical framework of the training corpus used and so is not predisposed to a particular grammatical formalism.
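As an illustration only, the control regime and the evaluation measure named in the abstract can be sketched in Python. This is a minimal sketch under stated assumptions, not the authors' implementation. `toy_chooser` is a hypothetical hand-written rule table standing in for the trained connectionist modules. The action encoding, the part-of-speech tags and the `max_steps` guard are invented for the example. `labelled_precision_recall` assumes the common definition in which a predicted constituent counts as correct when both its label and its span match a gold constituent.

```python
from collections import Counter
from typing import Callable, List, Sequence, Tuple

# A constituent is (label, start, end), with end exclusive (an assumption).
Item = Tuple[str, int, int]


def shift_reduce_parse(tags: Sequence[str],
                       choose_action: Callable[[List[Item], Sequence[str]], tuple],
                       max_steps: int = 1000) -> List[Item]:
    """Deterministic shift-reduce loop: one action per step, no backtracking.

    Actions (hypothetical encoding): ("shift",), ("reduce", label, n), ("stop",).
    """
    stack: List[Item] = []
    brackets: List[Item] = []
    pos = 0
    for _ in range(max_steps):  # guard against a chooser that never stops
        action = choose_action(stack, tags[pos:])
        if action[0] == "shift" and pos < len(tags):
            # Move the next input tag onto the stack.
            stack.append((tags[pos], pos, pos + 1))
            pos += 1
        elif action[0] == "reduce" and 1 <= action[2] <= len(stack):
            # Replace the top n stack items with one labelled constituent.
            _, label, n = action
            children = stack[-n:]
            del stack[-n:]
            item = (label, children[0][1], children[-1][2])
            stack.append(item)
            brackets.append(item)
        else:
            break  # "stop" or an action that cannot be applied
    return brackets


def toy_chooser(stack: List[Item], remaining: Sequence[str]) -> tuple:
    """Hypothetical stand-in for the learned modules: a tiny fixed rule table."""
    labels = [item[0] for item in stack]
    if labels[-2:] == ["DT", "NN"]:
        return ("reduce", "NP", 2)
    if labels[-2:] == ["VB", "NP"] and not remaining:
        return ("reduce", "VP", 2)
    if labels[-2:] == ["NP", "VP"] and not remaining:
        return ("reduce", "S", 2)
    if remaining:
        return ("shift",)
    return ("stop",)


def labelled_precision_recall(predicted: Sequence[Item],
                              gold: Sequence[Item]) -> Tuple[float, float]:
    """Precision = matched / predicted, recall = matched / gold."""
    # Multiset intersection so duplicate constituents are not over-counted.
    matched = sum((Counter(predicted) & Counter(gold)).values())
    precision = matched / len(predicted) if predicted else 0.0
    recall = matched / len(gold) if gold else 0.0
    return precision, recall


if __name__ == "__main__":
    # Tags for a toy sentence such as "the dog chased the cat".
    tags = ["DT", "NN", "VB", "DT", "NN"]
    predicted = shift_reduce_parse(tags, toy_chooser)
    gold = [("NP", 0, 2), ("NP", 3, 5), ("VP", 2, 5), ("S", 0, 5)]
    print(predicted)
    print(labelled_precision_recall(predicted, gold))
```

In the system the abstract describes, the decision at each step would come from the trained connectionist modules rather than from a rule table. The chooser is passed in as a function so that the symbolic stack and input handling stay separate from whatever makes the decision.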
Similar resources
Description Based Parsing in a Connectionist Network
James Brinton Henderson, Mitchell Marcus. Recent developments in connectionist architectures for symbolic computation have made it possible to investigate parsing in a connectionist network while still taking advantage of the large body of work on parsing in symbolic frameworks. This dissertation investigates syntactic parsing in the temporal s...
Large Scale Corpus Analysis and Recent Applications
Recent progress in corpus-based and machine learning-based natural language processing methodologies has made it possible to handle large-scale corpora with quite high accuracy. The speaker is now involved in a project for constructing a large-scale contemporary Japanese balanced corpus, aiming at constructing automatic annotation tools on various levels of natural language analysis. I will first i...
Enhancing First-Pass Attachment Prediction
This paper explores the convergence between cognitive modeling and engineering solutions to the parsing problem in NLP. Natural language presents many sources of ambiguity, and several theories of human parsing claim that ambiguity is resolved by using past (linguistic) experience. In this paper we analyze and refine a connectionist paradigm (Recursive Neural Networks) capable of processing acy...
Large-scale connectionist natural language parsing using lexical semantic and syntactic knowledge
Integrating connectionist, statistical and symbolic approaches for continuous spoken Korean processing
This paper presents a multi-strategic and hybrid approach for large-scale integrated speech and natural language processing, employing connectionist, statistical and symbolic techniques. The developed spoken Korean processing engine (SKOPE) integrates connectionist TDNN-based phoneme recognition technique with statistical Viterbi-based lexical decoding and symbolic morphological/phonological an...
Journal title:
- Connect. Sci.
Volume 14 Issue
Pages -
Publication date 2002